A Scalable Clustering Method for Categorical Sequences

Similar resources

Scalable Hierarchical Clustering Method for Sequences of Categorical Values

Data clustering methods have many applications in the area of data mining. Traditional clustering algorithms deal with quantitative or categorical data points. However, there exist many important databases that store categorical data sequences, where significant knowledge is hidden behind sequential dependencies between the data. In this paper we introduce a problem of clustering categorical da...

Clustering From Categorical Data Sequences

The three-parameter cluster model is a combinatorial stochastic process that generates categorical response sequences by randomly perturbing a fixed clustering parameter. This clear relationship between the observed data and the underlying clustering is particularly attractive in cluster analysis, in which supervised learning is a common goal and missing data is a familiar issue. The model is w...

Clustering Sequences of Categorical Values

Conceptual clustering is a discovery process that groups a set of data in the way that the intra-cluster similarity is maximized and the inter-cluster similarity is minimized. Traditional clustering algorithms employ some measure of distance between data points in n-dimensional space. However, not all data types can be represented in a metric space, therefore no natural distance function is ava...

A scalable algorithm for clustering protein sequences

The enormous growth of public sequence databases and continuing addition of fully sequenced genomes has created many challenges in developing novel and scalable computational techniques for searching, comparing, and analyzing these databases. Over the years, many methods have been developed for clustering proteins according to their sequence similarity. However, most of these methods tend to ha...

LIMBO: Scalable Clustering of Categorical Data

Clustering is a problem of great practical importance in numerous applications. The problem of clustering becomes more challenging when the data is categorical, that is, when there is no inherent distance measure between data values. We introduce LIMBO, a scalable hierarchical categorical clustering algorithm that builds on the Information Bottleneck (IB) framework for quantifying the relevant ...

Journal

Journal title: Journal of Korean Institute of Intelligent Systems

Year: 2004

ISSN: 1976-9172

DOI: 10.5391/jkiis.2004.14.2.136